PAX: Partition-Aware Autoscaling for the Cassandra NoSQL Database

Authors

  • Salvatore Dipietro
  • Rajkumar Buyya

Abstract

Apache Cassandra has emerged as one of the most widely adopted NoSQL databases. However, there is still limited understanding of how to operate Cassandra optimally in the cloud using autoscaling methods, by which resources are scaled up or down to reduce operational costs and meet service-level objectives (SLOs). To address this limitation, we present PAX, a partition-aware elastic resource management system for Apache Cassandra. PAX uses low-overhead query sampling and knowledge of the data partitioning across the nodes to automatically adapt capacity in Cassandra clusters. Unlike existing autoscaling methods for Cassandra, which incur long acquisition times for new nodes, PAX exploits Cassandra’s hinted handoff mechanism and a shared hints storage to minimize the time needed to bring a new node into the cluster. We propose a reactive and a proactive implementation of PAX and compare their performance under different workloads with varying intensities and item popularity distributions, finding that the proactive version significantly reduces SLO violations.

Similar resources

NoSQL Databases

In this document, I present the main notions of NoSQL databases and compare four selected products (Riak, MongoDB, Cassandra, Neo4J) according to their capabilities with respect to consistency, availability, and partition tolerance, as well as performance. I also propose a few criteria for selecting the right tool for the right situation.

Searchable Encryption in Apache Cassandra

In today’s cloud computing applications, it is common practice for clients to outsource their data to cloud storage providers. That data may contain sensitive information, which the client wishes to protect against this untrustworthy environment. Confidentiality can be preserved by the use of encryption. Unfortunately, that makes it difficult to perform efficient searches. There are a couple of d...

Improving Kieker's Scalability by Employing Linked Read-Optimized and Write-Optimized NoSQL Storage

Kieker’s monitoring output can be persistently saved into logs by utilizing relational databases or file systems. Currently, there is no support for NoSQL storage. As part of our Regression Benchmarking Execution Environment (RBEE), we introduce a self-contained system offering NoSQL storage capabilities and acting as a gateway between Kieker and RBEE. We show how polyglot persistence can increase...

Dynamic Workload-Aware Elastic Scale-Out in Cloud Data Stores

NoSQL databases store huge amounts of data generated by modern web applications. To improve scalability, a database is partitioned and distributed among different nodes, a process known as scale-out. However, this scale-out feature of NoSQL databases is oblivious to the data access patterns of web applications, which results in poorly distributed data across the nodes. Therefore, the cos...

Optimal adaption for Apache Cassandra

Apache Cassandra is a NoSQL database offering high scalability and availability. Along with its competitors, e.g. HBase, SimpleDB and BigTable, Cassandra is a widely used platform for big data systems. Tuning the performance of those systems is a complex task, and there is a growing demand for autonomic management solutions. In this paper we present an energy-aware adaptation model built from a ...

Publication date: 2017